Towards Symmetric Multimodality: Fusion and Fission of Speech, Gesture, and Facial Expression
Abstract
We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g., speech, gesture, facial expression) are also available for output, and vice versa. A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input, but also its own multimodal output. We present the SmartKom system, which provides full symmetric multimodality in a mixed-initiative dialogue system with an embodied conversational agent. SmartKom represents a new generation of multimodal dialogue systems that go beyond simple modality integration and synchronization to cover the full spectrum of dialogue phenomena associated with symmetric multimodality (including crossmodal references, one-anaphora, and backchannelling). We show that SmartKom's plug-and-play architecture supports multiple recognizers for a single modality; e.g., the user's speech signal can be processed by three unimodal recognizers in parallel (speech recognition, emotional prosody, boundary prosody). Finally, we detail SmartKom's three-tiered representation of multimodal discourse, consisting of a domain layer, a discourse layer, and a modality layer.
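As an illustration of the idea that one modality can feed several recognizers at once, the following is a minimal sketch, not SmartKom's actual implementation. It runs three stand-in recognizers on the same speech signal in parallel and collects their hypotheses. All names (`Hypothesis`, `analyze_speech`, the three recognizer functions) and all returned labels and confidence values are hypothetical placeholders chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    recognizer: str    # which unimodal recognizer produced this result
    label: str         # placeholder recognizer output (hypothetical)
    confidence: float  # placeholder score in [0, 1] (hypothetical)


# Hypothetical stand-ins for the three recognizers named in the abstract.
# Real recognizers would analyze the signal; these return fixed values.
def speech_recognition(signal: bytes) -> Hypothesis:
    return Hypothesis("speech_recognition", "example utterance", 0.8)


def emotional_prosody(signal: bytes) -> Hypothesis:
    return Hypothesis("emotional_prosody", "neutral", 0.6)


def boundary_prosody(signal: bytes) -> Hypothesis:
    return Hypothesis("boundary_prosody", "phrase_final", 0.7)


# "Plug-and-play" is modeled here as a plain list that recognizers can be
# added to or removed from without touching analyze_speech.
RECOGNIZERS = [speech_recognition, emotional_prosody, boundary_prosody]


def analyze_speech(signal: bytes) -> list[Hypothesis]:
    """Run every registered recognizer on the same signal in parallel."""
    with ThreadPoolExecutor(max_workers=len(RECOGNIZERS)) as pool:
        futures = [pool.submit(recognize, signal) for recognize in RECOGNIZERS]
        # Collect in registration order so the result is deterministic.
        return [future.result() for future in futures]


if __name__ == "__main__":
    for hyp in analyze_speech(b"\x00\x01"):
        print(hyp.recognizer, hyp.label, hyp.confidence)
```

In this sketch, a later fusion step would combine the returned hypotheses; how SmartKom actually does that is not described in the abstract and is left out here.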
Similar resources
SmartKom: Symmetric Multimodality in an Adaptive and Reusable Dialogue Shell
We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g., speech, gesture, facial expression) are also available for output, and vice versa. A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input, but also its own multimodal output. We present the SmartKom system, which provides full symmetric ...
Dialogue Systems Go Multimodal: The SmartKom Experience
Multimodal dialogue systems exploit one of the major characteristics of human-human interaction: the coordinated use of different modalities. Allowing all of the modalities to refer to and depend upon each other is a key to the richness of multimodal communication. We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g., speech, gesture, facial expr...
Natural Interactivity Resources - Data, Annotation Schemes and Tools
This paper presents results of three surveys of natural interactivity and multimodal resources carried out by a Working Group in the ISLE project on International Standards for Language Engineering. Information has been collected on a large number of corpora, coding schemes and coding tools world-wide. The paper presents the information collection process, the description and validation methods...
Towards Formal Multimodal Analysis of Emotions for Affective Computing
Social robotics concerns the interaction between robotic systems and humans. Social robots have applications in elderly care, health care, home care, customer service, and reception in industrial settings. Human-Robot Interaction (HRI) requires a better understanding of human emotion. There are few multimodal fusion systems that integrate a limited amount of facial expression, speech and gesture analysi...
Computed Ontology-based Situation Awareness of Multi-User Observations
In recent years, we have developed a framework of human-computer interaction that offers recognition of various communication modalities including speech, lip movement, facial expression, handwriting/drawing, gesture, text and visual symbols. The framework allows the rapid construction of a multimodal, multi-device, and multi-user communication system within crisis management. This paper report...